Prosodic vs. segmental contributions to naturalness in a diphone synthesizer

Authors

  • H. Timothy Bunnell
  • Steven R. Hoskins
  • Debra Yarrington
Abstract

The relative contributions of segmental versus prosodic factors to the perceived naturalness of synthetic speech were measured by transplanting prosody between natural speech and the output of a diphone synthesizer. A small corpus was created containing matched sentence pairs in which one member of the pair was a natural utterance and the other was a synthetic utterance generated with diphone data from the same talker. Two additional sentences were formed from each sentence pair by transplanting the prosodic structure between the natural and synthetic members of the pair. In two listening experiments, subjects were asked to (a) classify each sentence as “natural” or “synthetic,” or (b) rate the naturalness of each sentence. Results showed that prosodic information was more important than segmental information in both the classification and the ratings of naturalness.
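The transplantation design described in the abstract yields four stimulus versions per sentence pair: each segmental source combined with each prosody source. The following is a minimal sketch that only enumerates those conditions; the names `SOURCES` and `stimulus_conditions` are hypothetical and chosen for illustration, and the sketch says nothing about how the authors actually performed the signal-level transplantation.

```python
from itertools import product

# The two possible origins of segmental content and of prosody (from the abstract).
SOURCES = ("natural", "synthetic")


def stimulus_conditions():
    """Enumerate the segmental-source x prosody-source combinations.

    Hypothetical helper: it lists the conditions of the design only,
    and does not process any audio.
    """
    conditions = []
    for segments, prosody in product(SOURCES, repeat=2):
        conditions.append(
            {
                "segments": segments,
                "prosody": prosody,
                # A version is "transplanted" when prosody comes from the other member.
                "transplanted": segments != prosody,
            }
        )
    return conditions


if __name__ == "__main__":
    for condition in stimulus_conditions():
        print(condition)
```

Two of the four conditions are the original natural and synthetic utterances, and the other two are the transplanted versions, which matches the "two additional sentences" formed from each pair.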

Similar resources

Prosody Analysis of L2 English for Naturalness Evaluation Through Speech Modification

This study investigates how different prosodic features affect native speakers' naturalness judgement of L2 English speech by Chinese students. Through subjective judgment by native speakers and objectively measured prosodic features, timing and pitch related prosodic features, as well as segmental goodness of pronunciation have been found to play key roles in native speakers' perception of nat...

On the Modelling of Prosodic Cues in Synthetic Speech – What are the Effects on Perceived Uncertainty and Naturalness?

In this paper we present work on the modelling of uncertainty by means of prosodic cues in an articulatory speech synthesizer. Our stimuli are embedded into short dialogues in question-answering situations in a human-machine scenario. The answers of the robot vary with respect to the intended level of (un)certainty, the independent variables are intonation (rising vs. falling) and filler (absen...

Prosodic Reading Style Simulation for Text-to-Speech Synthesis

The simulation of different reading styles (mainly by adapting prosodic parameters) can improve the naturalness of synthetic speech and supports a more intelligent human machine interaction. The article exemplarily investigates the reading styles News and Tale. For comparison, all examined texts contained the same genre-neutral paragraphs which have been read without a specific style instructio...

TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis

In an effort to select a speech representation for our next generation concatenative text-to-speech synthesizer, the use of two candidates is investigated: TD-PSOLA and the Harmonic plus Noise Model (HNM). A formal listening test has been conducted and the two candidates have been rated regarding intelligibility, naturalness and pleasantness. Ability for database compression and computational lo...

Farsi language prosodic structure, research and implementation using a speech synthesizer

In this research, we have investigated the prosodic features of the Farsi (Persian) language and quantified major stress rules and some intonation rules for speech synthesis purposes. The research concentrates mostly on pitch variations and then on durational changes. We have implemented the proposed simplified prosodic rules using a Klatt formant synthesizer, specially modified for Farsi phone...


Publication date: 1998